Thread Shadowing: Using Dynamic Redundancy on Hybrid Multi-cores for Error Detection
Authors
Abstract
Dynamic thread duplication is a known redundancy technique for multi-cores. The approach duplicates a thread under observation for some time period and compares the signatures of the two threads to detect errors. Hybrid multi-cores, typically implemented on platform FPGAs, enable the unique option of running the thread under observation and its copy in different modalities, i.e., software and hardware. We denote our dynamic redundancy technique on hybrid multi-cores as thread shadowing. In this paper we present the concept of thread shadowing and an implementation on a multi-threaded hybrid multi-core architecture. We report on experiments with a block-processing application and demonstrate the overheads, detection latencies and coverage for a range of thread shadowing modes. The results show that trans-modal thread shadowing, although bearing long detection latencies, offers attractive coverage at a low overhead.
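The duplicate-and-compare idea described in the abstract can be illustrated with a minimal software-only sketch. It is not the paper's implementation: the abstract does not say how signatures are computed or how the hardware copy is realized, so the CRC32 signature, the byte-inverting kernel, and the names `signature`, `process_block`, `faulty_process_block`, and `shadowed_run` are all assumptions made for illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def signature(data: bytes) -> int:
    # Assumed signature: CRC32 of the block output (the paper may use another scheme).
    return zlib.crc32(data)


def process_block(block: bytes) -> bytes:
    # Hypothetical block-processing kernel: invert every byte.
    return bytes(b ^ 0xFF for b in block)


def faulty_process_block(block: bytes) -> bytes:
    # Same kernel with an injected single-bit error in the first output byte.
    out = bytearray(process_block(block))
    if out:
        out[0] ^= 0x01
    return bytes(out)


def shadowed_run(primary, shadow, blocks):
    """Run the observed thread and its shadow on each block.

    Returns the indices of blocks whose output signatures differ.
    """
    mismatches = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        for i, block in enumerate(blocks):
            observed = pool.submit(primary, block)
            shadowed = pool.submit(shadow, block)
            if signature(observed.result()) != signature(shadowed.result()):
                mismatches.append(i)
    return mismatches


if __name__ == "__main__":
    blocks = [b"abc", b"", b"hello"]
    print(shadowed_run(process_block, process_block, blocks))
    print(shadowed_run(faulty_process_block, process_block, blocks))
```

In this sketch both copies run in software. In the trans-modal case the abstract describes, the shadow would instead be a hardware implementation of the same thread, and only the signature comparison would stay the same.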
Related resources
Solving a Redundancy Allocation Problem by a Hybrid Multi-objective Imperialist Competitive Algorithm
A redundancy allocation problem (RAP) is a well-known NP-hard problem that involves the selection of elements and redundancy levels to maximize the system reliability under various system-level constraints. In many practical design situations, reliability apportionment is complicated because of the presence of several conflicting objectives that cannot be combined into a single-objective functi...
Power-Efficient Approaches to Reliability
Radiation-induced soft errors (transient faults) in computer systems have increased significantly over the last few years and are expected to increase even more as we move towards smaller transistor sizes and lower supply voltages. Fault detection and recovery can be achieved through redundancy. State-of-the-art implementations execute two copies of the same program as two threads, either on th...
Parallelism and Performance Comparison of FFT on Multi Core Machines
This work aims to propose various models for algorithm implementation on multi-core processors, and discusses pros and cons of each one. Procedural programming languages like C++ do not tend to fully utilize the multi-core processor as the program has only one thread which runs on one logical CPU. So for N cores, CPU utilization will be (100/N) % only unless the algorithm runs in parallel on di...
Inter-cluster Thread-to-core Mapping and DVFS on Heterogeneous Multi-cores
Heterogeneous multi-core platforms that contain different types of cores, organized as clusters, are emerging, e.g. ARM’s big.LITTLE architecture. These platforms often need to deal with multiple applications, having different performance requirements, executing concurrently. This leads to generation of varying and mixed workloads (e.g. compute and memory intensive) due to resource sharing. Run...
An approach to fault detection and correction in design of systems using of Turbo codes
We present an approach to the design of fault-tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes in order to obtain flexibility in the design of error-correcting codes. Code combining techniques are very effective, and turbo codes are one example of such codes. The algorithm-based fault tolerance techniques that, to detect errors, rely on the c...